3D Representation Learning for Shape Reconstruction and Understanding
The real world we live in is inherently composed of 3D objects. However, most existing work in computer vision focuses on images or videos, where 3D information is inevitably lost through camera projection. Traditional methods typically rely on hand-crafted algorithms and features, with many constraints and geometric priors, to understand the real world. Following the rise of deep learning, however, there has been exponential growth in research using deep neural networks to learn 3D representations of complex shapes and scenes, leading to cutting-edge applications in augmented reality (AR), virtual reality (VR), and robotics, and making 3D representation learning one of the most important directions in computer vision and computer graphics.
This thesis aims to build an intelligent system with dynamic 3D representations that change over time, so as to understand and recover the real world with semantic, instance, and geometric information, and eventually to bridge the gap between the real world and the digital world. As a first step towards these challenges, this thesis explores both explicit and implicit representations by directly addressing open problems in these areas. It starts from neural implicit representation learning for 3D scene representation and understanding, and then moves to a parametric-model-based explicit 3D reconstruction method. Extensive experimentation on benchmarks across various domains demonstrates the superiority of our methods over previous state-of-the-art approaches, enabling many real-world applications. Based on the proposed methods and current observations of open problems, the thesis concludes with a comprehensive summary and potential future research directions.
$T\bar{T}$ deformation in SCFTs and integrable supersymmetric theories
We calculate the $\mathcal{S}$-multiplets for two-dimensional Euclidean $\mathcal{N}=(0,2)$ and $\mathcal{N}=(2,2)$ superconformal field theories under the $T\bar{T}$ deformation at leading order of perturbation theory in the deformation coupling. Then, from these deformed multiplets, we calculate two- and three-point correlators. We show that the chiral ring's elements do not flow under the $T\bar{T}$ deformation. For the case of $\mathcal{N}=(2,2)$, we show that the twisted chiral ring and chiral ring cease to exist simultaneously. Specializing to integrable supersymmetric seed theories, such as $\mathcal{N}=(2,2)$ Landau-Ginzburg models, we use the thermodynamic Bethe ansatz to study the S-matrices and ground state energies. From both an S-matrix perspective and Melzer's folding prescription, we show that the deformed ground state energy obeys the inviscid Burgers' equation. Finally, we show that several indices independent of $D$-term perturbations, including the Witten index, the Cecotti-Fendley-Intriligator-Vafa index, and the elliptic genus, do not flow under the $T\bar{T}$ deformation.
Comment: 46 pages
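The Burgers'-equation statement in this abstract can be made explicit. In the standard $T\bar{T}$ conventions (the normalization of the coupling $\lambda$ below is an assumption, since the abstract does not fix one), the finite-volume energy $E_n(R,\lambda)$ of a state with momentum $P_n$ on a spatial circle of circumference $R$ flows as
\[
  \partial_\lambda E_n(R,\lambda) \;=\; E_n\,\partial_R E_n \;+\; \frac{P_n^2}{R},
\]
so for the zero-momentum ground state this reduces to the inviscid Burgers' equation $\partial_\lambda E_0 = E_0\,\partial_R E_0$, which can be solved implicitly by the method of characteristics.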
SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark
In this paper, we present SignAvatars, the first large-scale multi-prompt 3D
sign language (SL) motion dataset designed to bridge the communication gap for
hearing-impaired individuals. While research on digital communication has grown rapidly, the majority of existing communication technologies primarily cater to spoken or written languages rather than SL, the essential communication method for hearing-impaired communities. Existing SL datasets, dictionaries, and sign language production (SLP) methods are typically limited to 2D, as annotating 3D models and avatars for SL is usually an entirely manual, labor-intensive process conducted by SL experts, often resulting in unnatural avatars. In response to
these challenges, we compile and curate the SignAvatars dataset, which
comprises 70,000 videos from 153 signers, totaling 8.34 million frames,
covering both isolated signs and continuous, co-articulated signs, with
multiple prompts including HamNoSys, spoken language, and words. To yield 3D
holistic annotations, including meshes and biomechanically-valid poses of body,
hands, and face, as well as 2D and 3D keypoints, we introduce an automated
annotation pipeline operating on our large corpus of SL videos. SignAvatars
facilitates various tasks such as 3D sign language recognition (SLR) and the
novel 3D SL production (SLP) from diverse inputs like text scripts, individual
words, and HamNoSys notation. To evaluate the potential of SignAvatars, we further propose a unified benchmark of 3D SL holistic motion production. We believe this work is a significant step towards bringing the digital world to hearing-impaired communities. Our project page is at
https://signavatars.github.io/
Comment: 9 pages; project page available at https://signavatars.github.io
Decomposed Human Motion Prior for Video Pose Estimation via Adversarial Training
Estimating human pose from video is a task that receives considerable
attention due to its applicability in numerous 3D fields. The complexity of
prior knowledge of human body movements poses a challenge to neural network
models in the task of regressing keypoints. In this paper, we address this
problem by incorporating a motion prior in an adversarial way. Unlike previous methods, we propose to decompose the holistic motion prior into joint-level motion priors, making it easier for neural networks to learn from prior knowledge and thereby boosting performance on the task. We also utilize a novel regularization loss to balance the accuracy and smoothness introduced by the motion prior. Our method achieves 9% lower PA-MPJPE and 29% lower acceleration error than previous methods on the 3DPW benchmark. The estimator further proves its robustness by achieving impressive performance on in-the-wild datasets.
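The accuracy-versus-smoothness trade-off described in this abstract can be illustrated with a minimal sketch. All names and the weighting scheme below are hypothetical stand-ins, not the paper's actual loss or network: the sketch only shows a keypoint-accuracy term combined with an acceleration-based smoothness regularizer over a pose sequence.

```python
import numpy as np

def accuracy_loss(pred, gt):
    # Mean per-joint position error over a (T, J, 3) pose sequence.
    return np.linalg.norm(pred - gt, axis=-1).mean()

def smoothness_loss(pred):
    # Penalize second-order differences (acceleration) of each joint
    # trajectory, encouraging temporally smooth motion.
    accel = pred[2:] - 2.0 * pred[1:-1] + pred[:-2]
    return np.linalg.norm(accel, axis=-1).mean()

def total_loss(pred, gt, w_smooth=0.1):
    # w_smooth trades keypoint accuracy against smoothness;
    # its value here is illustrative only.
    return accuracy_loss(pred, gt) + w_smooth * smoothness_loss(pred)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(16, 24, 3))               # 16 frames, 24 joints
    pred = gt + 0.01 * rng.normal(size=gt.shape)    # noisy prediction
    print(total_loss(pred, gt))
```

Note that a perfectly linear trajectory has zero acceleration, so the regularizer only penalizes jitter, not motion itself.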
U3DS: Unsupervised 3D Semantic Scene Segmentation
Contemporary point cloud segmentation approaches largely rely on richly
annotated 3D training data. However, it is both time-consuming and challenging
to obtain consistently accurate annotations for such 3D scene data. Moreover,
there is still a lack of investigation into fully unsupervised scene
segmentation for point clouds, especially for holistic 3D scenes. This paper
presents U3DS, as a step towards completely unsupervised point cloud
segmentation for any holistic 3D scenes. To achieve this, U3DS leverages a
generalized unsupervised segmentation method for both object and background
across both indoor and outdoor static 3D point clouds with no requirement for
model pre-training, by leveraging only the inherent information of the point
cloud to achieve full 3D scene segmentation. The initial step of our proposed
approach involves generating superpoints based on the geometric characteristics
of each scene. Subsequently, it undergoes a learning process through a spatial
clustering-based methodology, followed by iterative training using
pseudo-labels generated in accordance with the cluster centroids. Moreover, by
leveraging the invariance and equivariance of the volumetric representations,
we apply the geometric transformation on voxelized features to provide two sets
of descriptors for robust representation learning. Finally, our evaluation provides state-of-the-art results on the ScanNet and SemanticKITTI benchmark datasets, and competitive results on S3DIS.
Comment: 10 pages, 4 figures; accepted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024
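The cluster-then-pseudo-label loop described in this abstract can be sketched in miniature. The code below is an illustrative toy only (the features, the plain k-means clustering with farthest-point initialization, and all parameters are stand-ins, not U3DS's actual pipeline): features are clustered, and each point's cluster assignment becomes its pseudo-label for the next training round.

```python
import numpy as np

def kmeans_pseudo_labels(features, k, iters=10):
    """Cluster per-point features and return (pseudo_labels, centroids).

    Stands in for the spatial-clustering step: each point's pseudo-label
    is the index of its nearest cluster centroid.
    """
    # Farthest-point initialization: deterministic and well spread out.
    centroids = [features[0]]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(
            features[:, None] - np.array(centroids)[None], axis=-1), axis=1)
        centroids.append(features[int(d.argmax())])
    centroids = np.array(centroids)

    for _ in range(iters):
        # Assign every feature vector to its nearest centroid.
        d = np.linalg.norm(features[:, None] - centroids[None], axis=-1)
        labels = d.argmin(axis=1)
        # Recompute centroids from the current assignment.
        for c in range(k):
            mask = labels == c
            if mask.any():
                centroids[c] = features[mask].mean(axis=0)
    return labels, centroids

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two well-separated blobs playing the role of per-point features.
    feats = np.vstack([rng.normal(0.0, 0.1, (50, 8)),
                       rng.normal(5.0, 0.1, (50, 8))])
    labels, _ = kmeans_pseudo_labels(feats, k=2)
    print(labels[:3], labels[-3:])
```

In the full method these pseudo-labels would supervise the next training iteration, with the equivariance constraint applied to geometrically transformed copies of the voxelized features.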
The Hitchhiker's Guide to 4d $\mathcal{N}=2$ Superconformal Field Theories
Superconformal field theory with $\mathcal{N}=2$ supersymmetry in four-dimensional spacetime provides a prime playground for studying strongly coupled phenomena in quantum field theory. Its rigid structure ensures valuable analytic control over non-perturbative effects, yet the theory is still flexible enough to incorporate a large landscape of quantum systems. Here we aim to offer a guidebook to fundamental features of 4d $\mathcal{N}=2$ superconformal field theories and basic tools to construct them in string/M-/F-theory. The content is based on a series of lectures at the Quantum Field Theories and Geometry School (https://sites.google.com/view/qftandgeometrysummerschool/home) in July 2020.
Comment: v3: improved discussion, fixed typos, added references. v2: fixed typos, added references. v1: 96 pages. Based on a series of lectures at the Quantum Field Theories and Geometry School in July 2020
P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching
Accurately describing and detecting 2D and 3D keypoints is crucial to
establishing correspondences across images and point clouds. Despite a plethora
of learning-based 2D or 3D local feature descriptors and detectors having been
proposed, the derivation of a shared descriptor and joint keypoint detector
that directly matches pixels and points remains under-explored by the
community. This work takes the initiative to establish fine-grained
correspondences between 2D images and 3D point clouds. In order to directly
match pixels and points, a dual fully convolutional framework is presented that
maps 2D and 3D inputs into a shared latent representation space to
simultaneously describe and detect keypoints. Furthermore, an ultra-wide reception mechanism, in combination with a novel loss function, is designed to mitigate the intrinsic information variations between pixel and point local regions. Extensive experimental results demonstrate that our framework shows
competitive performance in fine-grained matching between images and point
clouds and achieves state-of-the-art results for the task of indoor visual
localization. Our source code will be available at [no-name-for-blind-review].
Comment: ICCV 2021
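Once pixels and points live in a shared descriptor space, establishing correspondences reduces to nearest-neighbor search over that space. The snippet below illustrates that final matching step only (the descriptors are random stand-ins, not outputs of the paper's network or detector): it performs mutual nearest-neighbor matching between L2-normalized 2D and 3D descriptors.

```python
import numpy as np

def mutual_nn_matches(desc_2d, desc_3d):
    """Return (pixel_idx, point_idx) pairs that are mutual nearest
    neighbors in the shared descriptor space."""
    # Cosine similarity between every pixel and point descriptor.
    a = desc_2d / np.linalg.norm(desc_2d, axis=1, keepdims=True)
    b = desc_3d / np.linalg.norm(desc_3d, axis=1, keepdims=True)
    sim = a @ b.T
    nn_12 = sim.argmax(axis=1)   # best point for each pixel
    nn_21 = sim.argmax(axis=0)   # best pixel for each point
    # Keep only pairs that choose each other.
    return [(i, j) for i, j in enumerate(nn_12) if nn_21[j] == i]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(40, 128))               # stand-in 3D descriptors
    pix = pts + 0.01 * rng.normal(size=pts.shape)  # noisy 2D counterparts
    print(len(mutual_nn_matches(pix, pts)))        # prints 40
```

The mutual-consistency check discards one-sided matches, a common way to suppress outliers before pose estimation or visual localization.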